Testing the National Software Reference Library
Abstract
The National Software Reference Library (NSRL) is an essential data source for forensic investigators, providing in its Reference Data Set (RDS) a set of hash values of known software. However, the NSRL RDS has not previously been tested against a broad spectrum of real-world data. The current work did this using a corpus of 36 million files on 2337 drives from 21 countries. These experiments answered a number of important questions about the NSRL RDS, including what fraction of files of different types it recognizes. NSRL coverage by vendor/product was also tested, finding that 51% of the vendor/product names in our corpus had no hash values at all in NSRL. It is shown that the coverage or "recall" of the NSRL can be improved with additions from our corpus, such as frequently occurring files and files whose paths were found previously in NSRL with a different hash value. This provided 937,570 new hash values which should be uncontroversial additions to NSRL. Several additional tests investigated the accuracy of the NSRL data. Experiments testing the hash values saw no evidence of errors. Tests of file sizes showed them to be consistent except for a few cases. On the other hand, the product types assigned by NSRL can be disputed, and NSRL failed to recognize any of a sample of virus-infected files. The file names provided by NSRL had numerous discrepancies with the file names found in the corpus, so the discrepancies were categorized; among other things, there were apparent spelling and punctuation errors. Some file names suggest that NSRL hash values were computed on deleted files, which is not a safe practice. The tests had the secondary benefit of helping identify occasional errors in the metadata obtained from drive imaging on deleted files in our corpus. This research has provided much data useful in improving NSRL and the forensic tools that depend upon it. It also provides a general methodology and software for testing hash sets against corpora. Published by Elsevier Ltd.
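The coverage measurement and the "frequently occurring files" heuristic mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' software. The record fields (`drive`, `path`, `sha1`), the function names, and the `min_drives` threshold are all hypothetical, and the hash strings are short placeholders rather than real SHA-1 digests.

```python
from collections import defaultdict


def coverage(corpus, known_hashes):
    """Fraction of corpus files whose hash appears in the reference set."""
    if not corpus:
        return 0.0
    hits = sum(1 for rec in corpus if rec["sha1"] in known_hashes)
    return hits / len(corpus)


def candidate_additions(corpus, known_hashes, min_drives=2):
    """Unrecognized hashes seen on at least `min_drives` distinct drives.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    drives_by_hash = defaultdict(set)
    for rec in corpus:
        if rec["sha1"] not in known_hashes:
            drives_by_hash[rec["sha1"]].add(rec["drive"])
    return {h for h, drives in drives_by_hash.items() if len(drives) >= min_drives}


# Toy corpus: placeholder hashes, hypothetical paths.
corpus = [
    {"drive": "d1", "path": "WINDOWS/notepad.exe", "sha1": "aaa"},
    {"drive": "d2", "path": "WINDOWS/notepad.exe", "sha1": "aaa"},
    {"drive": "d1", "path": "Program Files/app/app.dll", "sha1": "bbb"},
    {"drive": "d2", "path": "Program Files/app/app.dll", "sha1": "bbb"},
    {"drive": "d1", "path": "Documents/letter.doc", "sha1": "ccc"},
]
known = {"aaa"}

recall = coverage(corpus, known)
new_hashes = candidate_additions(corpus, known)
print(f"coverage: {recall:.2f}, candidate additions: {sorted(new_hashes)}")
```

In this toy corpus, two of five files are recognized, and only the hash found on two separate drives qualifies as a candidate addition; the file seen on a single drive is left out because it is more likely to be user-specific content.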
Related resources
Tool Testing Documents
Introduction NIST's Information Technology Laboratory has two ongoing research projects in computer forensics: the National Software Reference Library (NSRL) project and the Computer Forensics Tool Testing (CFTT) project. Funded through NIST's Office of Law Enforcement Standards (OLES) by the National Institute of Justice (NIJ), the Federal Bureau of Investigation (FBI), the Department of Defen...
Siddhartha: A Technique for Developing Domain-Specific Testing Tools
Existing specification-based software testing tools are difficult to use in many real-world application domains. This occurs when a tool’s design is constrained along dimensions (which are often theoretically-oriented) which are incompatible with the constraints at work in the application domain (which are often business-oriented). When this problem occurs and the benefits to the software organ...
1.1 Perl based framework for distributed processing
The National Software Reference Library (NSRL) of the U.S. National Institute of Standards and Technology (NIST) collects software from various sources and publishes file profiles computed from this software (such as MD5 and SHA-1 hashes) as a Reference Data Set (RDS) of information. The RDS can be used in the forensic examination of file systems, for example, to speed the process of identifyin...
The design and use of reference data sets for testing scientific software
A general methodology for evaluating the accuracy of the results produced by scientific software has been developed at the National Physical Laboratory. The basis of the approach is the design and use of reference data sets and corresponding reference results to undertake black-box testing. The approach enables reference data sets and results to be generated in a manner consistent with the func...
Analyzing the Role of Public Libraries Services in the Development of the Local Economy: Case Study of Kermanshah Province
Purpose: This study aimed to analyze the role of public libraries in the development of the local economy from the perspective of librarians working in public libraries of Kermanshah province in the current and desirable situation of public libraries. This study extracts the activities that the public library can do in the development of the local economy in the form of information and informat...